Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML][Transforms] remove force flag from _start #46414

Merged

Conversation

benwtrent
Copy link
Member

It no longer makes sense to include a force flag for our _start api.

If a failure occurs, a much safer path is to _stop?force=true and then call _start again. This allows for better clean up of resources and safer run-time guarantees.

The flag was not ever added to our docs or the HLRC.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

ActionListener<StartDataFrameTransformAction.Response> startTaskListener = ActionListener.wrap(
response -> logger.info("[{}] successfully completed and scheduled task in node operation", transformId),
failure -> {
auditor.error(transformId, "Failed to start data frame transform. " +
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed while troubleshooting, that if we fail on start within the executor, the user could never be the wiser. We should at least audit here.

I will experiment to see if setting the task to failed here is OK.

They will not be able to call _start again as the task already exists, it just won't be flagged as FAILED.
Maybe we want to simply audit and call .stop on the task?

Additionally, this type of failure is present in 7.4. We may want to at least add an audit message there before release.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding an audit for 7.4.1 sounds good to me.

As for "If anything goes wrong within startTask" - I think marking the task as failed and setting a reason is the preferable solution.

@@ -219,10 +218,17 @@ public void getCheckpointingInfo(DataFrameTransformsCheckpointService transforms
));
}

// Here `failOnConflict` is usually true, except when the initial start is called when the task is assigned to the node
synchronized void start(Long startingCheckpoint, boolean force, boolean failOnConflict, ActionListener<Response> listener) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I get this right?

Due to this simplification we do not need #46347 anymore and revert it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty much. Since the ONLY thing that can call start against the task is the node executor, we know that it should be allowed to call start against a started task.

ActionListener<StartDataFrameTransformAction.Response> startTaskListener = ActionListener.wrap(
response -> logger.info("[{}] successfully completed and scheduled task in node operation", transformId),
failure -> {
auditor.error(transformId, "Failed to start data frame transform. " +

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding an audit for 7.4.1 sounds good to me.

As for "If anything goes wrong within startTask" - I think marking the task as failed and setting a reason is the preferable solution.

@benwtrent
Copy link
Member Author

@elasticmachine update branch

Copy link

@hendrikmuhs hendrikmuhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hendrikmuhs
Copy link

@elasticmachine test this

@benwtrent
Copy link
Member Author

@elasticmachine update branch

@hendrikmuhs
Copy link

run elasticsearch-ci/docs

@benwtrent benwtrent merged commit 479ebd1 into elastic:master Sep 16, 2019
@benwtrent benwtrent deleted the feature/ml-transforms-remove-_start-force branch September 16, 2019 14:02
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Sep 16, 2019
* [ML][Transforms] remove `force` flag from _start

* fixing expected error message
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Sep 16, 2019
benwtrent added a commit that referenced this pull request Sep 16, 2019
* Muting tests for PR #46414

* changing bwc version
benwtrent added a commit that referenced this pull request Sep 18, 2019
* [ML][Transforms] remove `force` flag from _start (#46414)

* [ML][Transforms] remove `force` flag from _start

* fixing expected error message

* adjusting bwc version
walterra added a commit to elastic/kibana that referenced this pull request Oct 2, 2019
The force parameter was removed from start actions in elastic/elasticsearch#46414. This reflects the change in the UI API calls.
walterra added a commit to walterra/kibana that referenced this pull request Oct 2, 2019
The force parameter was removed from start actions in elastic/elasticsearch#46414. This reflects the change in the UI API calls.
walterra added a commit to elastic/kibana that referenced this pull request Oct 2, 2019
The force parameter was removed from start actions in elastic/elasticsearch#46414. This reflects the change in the UI API calls.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants